Jinpeng Chen

X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding

Jun 01, 2026
Via arXiv

UI-KOBE: Knowledge-Oriented Behavior Exploration for Lightweight Graph-Guided GUI Agents

May 28, 2026
Via arXiv

OmniInteract: Benchmarking Real-World Streaming Interaction for Real-Time Omnimodal Assistants

May 26, 2026
Via arXiv

See the Forest for the Trees: Loosely Speculative Decoding via Visual-Semantic Guidance for Efficient Inference of Video LLMs

Apr 07, 2026
Via arXiv

AURA: Always-On Understanding and Real-Time Assistance via Video Streams

Apr 05, 2026
Via arXiv

CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification

Mar 02, 2026
Via arXiv

PhoStream: Benchmarking Real-World Streaming for Omnimodal Assistants in Mobile Scenarios

Jan 30, 2026
Via arXiv

VP-Bench: A Comprehensive Benchmark for Visual Prompting in Multimodal Large Language Models

Nov 14, 2025
Via arXiv

From Exploration to Exploitation: A Two-Stage Entropy RLVR Approach for Noise-Tolerant MLLM Training

Nov 11, 2025
Via arXiv

SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning

May 05, 2025
Via arXiv